Summary of Changes
Hello @claudio525, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the workflow by introducing a robust framework for generating custom virtual simulation site grids. This new capability allows users to define specific criteria for site selection, ensuring consistency across different simulation sets while maintaining the flexibility for tailored analyses. The change also involves the removal of an outdated 'merge timeslices' utility, simplifying the overall system architecture.
Pull Request Overview
This PR adds comprehensive support for generating custom virtual site grids for earthquake simulations. The main purpose is to enable creation of domain-specific site grids by sub-sampling a high-density "general" grid, ensuring consistency across simulation sets while allowing customization. Additionally, this PR removes the deprecated merge timeslices functionality.
Key changes:
- New site_gen.py module implementing custom grid generation with configurable filters (land-only, regional bounds, uniform spacing, basin-specific spacing)
- CLI commands in site_gen_cmds.py for grid generation and visualization
- Removal of merge timeslices stage and related code
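To illustrate why sub-sampling a single dense grid keeps simulation sets consistent, here is a minimal sketch in plain NumPy. It is not the code in site_gen.py: the function name `subsample_grid`, its parameters, and the example coordinates are all hypothetical, and only uniform spacing and regional bounds are shown.

```python
import numpy as np


def subsample_grid(lon, lat, step, bounds=None):
    """Keep every `step`-th node of a regular grid, optionally within bounds.

    lon, lat: 2D arrays of identical shape (a hypothetical dense "general" grid).
    bounds: optional (lon_min, lon_max, lat_min, lat_max) tuple.
    Returns an (n, 2) array of [lon, lat] rows.
    """
    # Uniform spacing: select every `step`-th row and column.
    mask = np.zeros(lon.shape, dtype=bool)
    mask[::step, ::step] = True
    # Regional bounds: drop selected nodes outside the box.
    if bounds is not None:
        lon_min, lon_max, lat_min, lat_max = bounds
        mask &= (lon >= lon_min) & (lon <= lon_max)
        mask &= (lat >= lat_min) & (lat <= lat_max)
    return np.column_stack([lon[mask], lat[mask]])


# Hypothetical 10 x 10 general grid (coordinates are illustrative only).
lon, lat = np.meshgrid(np.linspace(170.0, 170.9, 10), np.linspace(-44.0, -43.1, 10))
fine = subsample_grid(lon, lat, step=2)
coarse = subsample_grid(lon, lat, step=4)
boxed = subsample_grid(lon, lat, step=2, bounds=(170.0, 170.45, -44.0, -43.0))
```

Because every custom grid is drawn from the same nodes, the coarser grid here is a subset of the finer one, so sites shared between simulation sets sit at identical coordinates.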
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| workflow/site_gen.py | Core implementation of general and custom grid classes with filtering capabilities |
| workflow/scripts/site_gen_cmds.py | Command-line interface for grid generation and plotting |
| workflow/scripts/plan_workflow.py | Removed deprecated MergeTimeslices stage identifier |
| workflow/scripts/merge_ts_loop.pyx | Deleted Cython implementation for timeslice merging |
| workflow/scripts/merge_ts.py | Deleted merge timeslices script |
| workflow/__init__.py | Removed merge timeslices documentation reference |
| wiki/Stages.md | Removed merge timeslices stage documentation |
| wiki/Site-Generation.md | Added comprehensive documentation for site generation feature |
| setup.py | Deleted setup file for Cython extensions |
| pyproject.toml | Removed merge-ts console script entry point |
Code Review
This pull request introduces a significant new feature for generating custom site grids, while also cleaning up by removing the old merge-ts functionality. The new site generation code is well-structured, but I've found several critical issues related to coordinate transformations that will cause incorrect geographical plotting of domains and sources. There are also some high-severity correctness issues in the grid-masking logic and inconsistencies between the implementation and the new documentation. Addressing these points will be crucial for the feature to work as intended and be understood by users. I've also included some medium-severity suggestions to improve code quality and usability.
These LLM comments look mostly reasonable. I'll take a pass at it once they are resolved.
See also: the pytest coverage checks are failing. You need some tests for your code.
Pull Request Overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 8 comments.
Pull Request Overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
    - basin: The basin the site is in (if any).
    - Z1.0: The Z1.0 value for the site (if vel_model_version is provided).
    - Z2.5: The Z2.5 value for the site (if vel_model_version is provided).
    - site_code: The site code.
The docstring is missing documentation for the 'Vs30' column. According to line 701, 'Vs30' is loaded from the NZGMDB data for real sites, and this column is referenced in the plotting code (site_gen_cmds.py lines 169, 177, 199, 207). The documentation should include: - Vs30: Time-averaged shear-wave velocity in the top 30 meters (m/s) (for real sites only).
Suggested change:
    - site_code: The site code.
    - Vs30: Time-averaged shear-wave velocity in the top 30 meters (m/s) (for real sites only).
@lispandfound Can you have a look at those failing type checks? They are in files that I haven't touched. It's a bit odd that the type checker is failing on those but was fine for the PR that added the type checking. This branch is up to date with the
@claudio525 See #72. It is failing tests due to a bug in zarr that I have opened an issue for upstream. The failures you see occur because the
    how="left",
    predicate="within",
    )
    # Keep first basin if in multiple basins
This could be an interesting point to raise here. When I discussed this issue with Robin, he had certain priority basins for points that fall in multiple basins. One way to handle this might be to order the list by priority, so that the first match is always the basin that should be used.
https://uceqeng.slack.com/archives/C019XU80PJ4/p1757991060818839
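A rough sketch of the priority-ordering idea, assuming the spatial join yields one row per (site, basin) match with `site_code` and `basin` columns. The helper `pick_priority_basin`, the `BASIN_PRIORITY` list, and the basin names are placeholders, not the actual priorities discussed with Robin.

```python
import pandas as pd

# Hypothetical priority order: earlier entries win when a site is in several basins.
BASIN_PRIORITY = ["Basin A", "Basin B", "Basin C"]


def pick_priority_basin(joined, priority):
    """Reduce a site-to-basin join with repeated sites to one row per site."""
    rank = {name: i for i, name in enumerate(priority)}
    # Basins not in the list (or no basin at all) get the lowest priority.
    ranked = joined.assign(_rank=joined["basin"].map(rank).fillna(len(priority)))
    # Stable sort keeps the original join order among equal ranks.
    ranked = ranked.sort_values("_rank", kind="stable")
    kept = ranked.drop_duplicates("site_code", keep="first")
    return kept.drop(columns="_rank").sort_index()


# Illustrative join result: site s1 falls in two basins, s3 in none.
joined = pd.DataFrame(
    {
        "site_code": ["s1", "s1", "s2", "s3"],
        "basin": ["Basin C", "Basin A", "Basin B", None],
    }
)
result = pick_priority_basin(joined, BASIN_PRIORITY)
```

Compared with keeping whichever basin the join happens to list first, this makes the choice explicit and independent of the order of the basin geometries.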
    include_sigma=False,
    logger=logger,
    )
    site_df["Z1.0"] = z_values["Z1.0(km)"]
I think this would overwrite all the Z1.0 values from the NZGMDB (line 740) if both options are selected, i.e. if both an NZGMDB version and a vel_model_version are provided.
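If the NZGMDB values are meant to take precedence (an assumption, since the intended behaviour is not stated here), one option is to fill only the missing entries rather than assigning the whole column. A minimal sketch with a hypothetical helper `fill_missing` and made-up values:

```python
import numpy as np
import pandas as pd


def fill_missing(site_df, column, computed):
    """Fill only missing entries of `column`; existing values are left untouched."""
    out = site_df.copy()
    if column in out:
        # fillna aligns `computed` on the index, so only NaN rows are replaced.
        out[column] = out[column].fillna(computed)
    else:
        out[column] = computed
    return out


# Illustrative data: the real site already has a Z1.0 value, the virtual site does not.
site_df = pd.DataFrame({"Z1.0": [0.25, np.nan]}, index=["real", "virtual"])
computed = pd.Series([0.4, 0.6], index=["real", "virtual"])
filled = fill_missing(site_df, "Z1.0", computed)
```

Here the existing value for the real site is kept and only the virtual site receives the velocity-model value.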
@claudio525 Bumping this because I think it will be useful, and I am happy to help get it over the finish line. Should we do backarc calculations here too?
Adds support for generating custom site grids. See the Wiki page for details.